A hybrid approach for gene selection and classification using support vector machine
Authors
Abstract
Deoxyribonucleic acid (DNA) microarray technology makes it possible to measure the expression of thousands of genes on a single chip. Analyzing gene expression data plays a vital role in understanding diseases and discovering medicines. Classification of cancer based on gene expression data is a promising research area in bioinformatics and data mining. Not all genes contribute to efficient classification of samples. Hence, a robust feature selection method is required to identify the relevant genes that help classify samples effectively. Most existing feature selection methods are computationally expensive. Redundancy in gene expression data leads to poor classification accuracy and also degrades multi-class classification. This paper proposes an ensemble feature selection technique that combines Recursive Feature Elimination (RFE) and the Based Bayes error Filter (BBF) for gene selection, with a Support Vector Machine (SVM) algorithm for classification. The proposed ensemble gene selection method yields classification performance comparable to that of existing classifiers and provides new insight into feature selection.
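The general shape of a filter-plus-wrapper gene selection pipeline can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how RFE and BBF are combined, so the sketch assumes a filter pass followed by SVM-based RFE. The filter score here is a simple signal-to-noise ratio standing in for BBF, which is not reproduced. The linear SVM is a small Pegasos-style trainer without a bias term. All function names, parameters, and the synthetic data are hypothetical.

```python
import random
from statistics import mean, pstdev

def filter_scores(X, y):
    """Stand-in filter (NOT BBF): |class mean difference| / sum of class std devs."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        spread = pstdev(a) + pstdev(b) or 1e-12  # guard against zero spread
        scores.append(abs(mean(a) - mean(b)) / spread)
    return scores

def filter_top_k(X, y, k):
    """Keep the indices of the k genes with the highest filter score."""
    scores = filter_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def train_linear_svm(X, y, features, epochs=50, lam=0.01, seed=0):
    """Pegasos-style subgradient descent for a linear SVM (no bias term)
    restricted to the given feature indices. Returns {feature: weight}."""
    rng = random.Random(seed)
    w = {j: 0.0 for j in features}
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(w[j] * X[i][j] for j in features)
            for j in features:
                w[j] *= 1.0 - eta * lam          # shrink (regularization)
                if margin < 1.0:                 # hinge loss is active
                    w[j] += eta * y[i] * X[i][j]
    return w

def svm_rfe(X, y, candidates, n_keep):
    """Recursive feature elimination: retrain, drop the gene with the
    smallest squared weight, repeat until n_keep genes remain."""
    remaining = list(candidates)
    while len(remaining) > n_keep:
        w = train_linear_svm(X, y, remaining)
        weakest = min(remaining, key=lambda j: w[j] ** 2)
        remaining.remove(weakest)
    return sorted(remaining)

# Synthetic data (assumption): 40 samples, 30 genes, genes 0 and 1 informative.
rng = random.Random(42)
n_samples, n_genes = 40, 30
y = [1 if i % 2 == 0 else -1 for i in range(n_samples)]
X = [[(3.0 * label if j < 2 else 0.0) + rng.gauss(0.0, 1.0)
      for j in range(n_genes)] for label in y]

kept = filter_top_k(X, y, k=10)                       # cheap filter pass
selected = svm_rfe(X, y, candidates=kept, n_keep=3)   # wrapper pass
print("after filter:", kept)
print("after SVM-RFE:", selected)
```

Running the filter first shrinks the candidate set, so the more expensive wrapper retrains the SVM far fewer times. That speaks to the computational cost the abstract raises, although the ordering of the two stages is an assumption of this sketch.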
Similar resources
Modeling and design of a diagnostic and screening algorithm based on hybrid feature selection-enabled linear support vector machine classification
Background: In the current study, a hybrid feature selection approach involving filter and wrapper methods is applied to several bioscience databases with varying numbers of records, attributes, and classes; the strategy therefore combines the advantages of both methods, such as fast execution, generality, and accuracy. The purpose is to diagnose disease status and estimate patient survival. Method...
Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine
DNA microarray gene expression gives us access to a wealth of information with thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of tumor differences. In this study we try to reduce the high-dimensional data with a statistical method to select valuable, high-impact genes as biomarkers and then classify ovarian tumors based on gene expression data of...
Identification of Alzheimer disease-relevant genes using a novel hybrid method
Identifying genes underlying complex diseases/traits that generally involve multiple etiological mechanisms and contributing genes is difficult. Although microarray technology has enabled researchers to investigate gene expression changes, identifying pathobiologically relevant genes remains a challenge. To address this challenge, we apply a new method for selecting the disease-relevant gen...
Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets
Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...
APPLICATION OF THE HYBRID HARMONY SEARCH WITH SUPPORT VECTOR MACHINE FOR IDENTIFICATION AND CALSSIFICATION OF DAMAGED ZONE AROUND UNDERGROUND SPACES
An excavation damage zone (EDZ) can be defined as a rock zone where the rock properties and conditions have been changed by processes related to an excavation. This zone affects the behavior of the rock mass surrounding the construction, which reduces the stability and safety factor and increases the probability of failure of the structure. This paper presents an approach to build a model for the ...
Sustainable Supplier Selection by a New Hybrid Support Vector-model based on the Cuckoo Optimization Algorithm
For assessing and selecting sustainable suppliers, this study considers a triple-bottom-line approach, including profit, people and planet, and takes into account the suppliers' business operations, environmental effects, and social responsibilities. Diverse metrics are introduced to measure performance in these three areas. This study builds a new hybrid intelligent model, namely COA-LS-SVM, ...
Journal title:
- Int. Arab J. Inf. Technol.
Volume 12, Issue
Pages -
Publication date: 2015